DUALITY FOR SUDOKU 



THOMAS FISCHER 

Abstract. We consider a mathematical model for the classical
Sudoku puzzle, which we call the primal problem, and introduce
a corresponding dual problem. Both problems are constraint
satisfaction models, and a duality relation between them is proved.
Based on these models, we introduce a primal and a dual
optimization problem and show weak and strong duality properties.



1. Introduction 

A Sudoku is a square consisting of a 9 × 9 grid which is partly
prepopulated by numbers between 1 and 9 called the givens. The problem
consists of finding numbers between 1 and 9 for all unpopulated cells,
such that each row, each column and each block consists of exactly the
numbers 1, ..., 9. The blocks of a Sudoku partition the Sudoku square
into subsquares of size 3 × 3. Each Sudoku consists of 9 rows, 9 columns
and 9 blocks.

In [6] we introduced a mathematical model for this Sudoku puzzle
and called it the generalized Sudoku problem. As we are concerned
with duality throughout this paper, we call it the primal problem in
Section 2. We introduce a dual problem in Section 3 and show the
relation between the primal and the dual. This relation is established
using a necessary solution condition developed in [6] and can be
interpreted as a duality result. However, the primal and the dual
problem defined in this way do not admit duality results that involve
duality gaps. Therefore we introduce primal and dual optimization
problems in Section 4 and show how they replace the original problems.
In Section 5 we prove a weak and a strong duality property between
the primal and the dual optimization problem.

The primal and the dual problem are of a type that is often called a
constraint satisfaction problem, or CSP. For a description of general
CSPs with examples, solution techniques and applications, see the
survey article of Dechter and Rossi [5]. Duality statements are standard
properties of linear and nonlinear programs. An overview with several
examples and applications can be found in the book of Boyd and
Vandenberghe [1]. The linear case has been treated by Dantzig and Thapa
[4].

Finally, we collect some basic terms and notations. Let $\mathbb{Z}$
denote the set of integers. The $n$-fold cartesian product of any set is
indicated by a superscript $n$, i.e., $\mathbb{Z}^n$ denotes the $n$-fold
cartesian product of $\mathbb{Z}$. The vectors $0$ and $1$ denote the
zero vector and the vector of ones, respectively, i.e., the vectors
consisting of zeros or of ones in each component. The number of
components of these vectors is often indicated by an index. Each vector
is considered to be a column vector. $U$ denotes the identity matrix.
The transpose of a vector or a matrix is indicated by a superscript $T$.
The sign function is denoted by sgn. We consider the sum over an
empty index set to be zero. The brackets with index $[\,\cdot\,]_i$
denote the $i$-th component of the vector contained in the brackets. The
symbol $\#$ denotes the number of elements (cardinality) of a finite set.



2. The Primal Problem 

We replicate here the definition of the generalized Sudoku problem
as introduced in [6] and call it this time the primal problem. Let $n$ be
an integer with $n \ge 1$. We define the sum

\[ s(n) = \sum_{i=1}^{n-1} i \]

and define a matrix $A(n)$ with $s(n)$ rows and $n$ columns inductively.
For $n = 1$, let $A(1)$ denote the empty matrix, i.e., a matrix without
entries. Assume the matrix $A(n-1)$ has been defined with $s(n-1)$
rows and $n-1$ columns. Then we set

\[
A(n) = \begin{pmatrix}
1_{n-1} & -U_{n-1} \\
0_{s(n-1)} & A(n-1)
\end{pmatrix}.
\]



We extend the matrix $A(n)$ to a matrix $A$ with $n \cdot s(n)$ rows and
$n^2$ columns. The matrix $A$ consists of $n$ copies of $A(n)$ on the
"main diagonal", and the remaining entries are set to zero. The matrix
$A$ depends on the value $n$, but we do not state this dependence
explicitly.

Given the set $\{1, \ldots, n^2\} \subset \mathbb{Z}$, let $\pi$ be any
permutation on this set, i.e.,

\[ \pi : \{1, \ldots, n^2\} \to \{1, \ldots, n^2\} \]

is a bijection. We extend the notion of permutation to the matrix
$A$, i.e., we define $\pi(A) = (a^{\pi(1)}, \ldots, a^{\pi(n^2)})$, where
$a^j$ denotes the $j$-th column of $A$ for $j = 1, \ldots, n^2$. Given a
permutation $\pi$ on $\{1, \ldots, n^2\}$, we define the matrix
$A_\pi = \pi(A)$, i.e., we interchange the columns of $A$ according to
the permutation $\pi$.
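The construction of $A(n)$, $A$ and $A_\pi$ can be sketched in a few lines of Python. This is only an illustration of the definitions above: the function names are hypothetical, matrices are stored as lists of rows, and indices are 0-based, whereas the text counts from 1.

```python
def s(n):
    """s(n) = 1 + 2 + ... + (n - 1), the number of rows of A(n)."""
    return n * (n - 1) // 2


def build_A_n(n):
    """Build A(n) as a list of rows, following the inductive definition."""
    if n == 1:
        return []  # A(1) is the empty matrix
    # Top block: a column of ones next to the negated identity U_{n-1}.
    top = [[1] + [-1 if c == r else 0 for c in range(n - 1)]
           for r in range(n - 1)]
    # Bottom block: a zero column next to A(n-1).
    bottom = [[0] + row for row in build_A_n(n - 1)]
    return top + bottom


def build_A(n):
    """Place n copies of A(n) on the main diagonal; all other entries are 0."""
    block = build_A_n(n)
    rows = []
    for b in range(n):
        for row in block:
            rows.append([0] * (b * n) + row + [0] * ((n - 1 - b) * n))
    return rows


def permute_columns(A, pi):
    """A_pi: column j of the result is column pi[j] of A (0-based)."""
    return [[row[pi[j]] for j in range(len(pi))] for row in A]


print(build_A_n(3))  # [[1, -1, 0], [1, 0, -1], [0, 1, -1]]
A = build_A(4)
print(len(A), len(A[0]))  # 24 16
```

Each row of $A(n)$ has one entry $+1$ and one entry $-1$, so a row of $A(n)x$ is the difference of two components of $x$.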

Definition 2.1. Let $s \ge 1$. For any point
$y = (y_1, \ldots, y_s)^T \in \mathbb{Z}^s$ we write $y <> 0$ if each
component of $y$ is nonzero, i.e., if $y_i \ne 0$ for $i = 1, \ldots, s$.

This definition should not be confused with the expression $y \ne 0$,
where only one component of $y$ has to be nonzero.

Given are $n \ge 2$, permutations $\pi_1, \pi_2, \pi_3$ on
$\{1, \ldots, n^2\}$, some $0 \le k \le n^2$, an index set
$\{i_1, \ldots, i_k\} \subset \{1, \ldots, n^2\}$, and givens
$g_{i_1}, \ldots, g_{i_k} \in \mathbb{Z}$ with $1 \le g_{i_l} \le n$ for
$l = 1, \ldots, k$.

Let $A_{eq}$ be the $k \times n^2$ matrix which consists of the rows
$i_1, \ldots, i_k$ of the identity matrix $U_{n^2}$, i.e., the $i_l$-th
component of the $l$-th row of $A_{eq}$ is equal to 1 (and zero
otherwise). In other words, $A_{eq}$ is defined by
$[A_{eq} x]_l = x_{i_l}$ for each
$x = (x_1, \ldots, x_{n^2})^T \in \mathbb{Z}^{n^2}$ and $l = 1, \ldots, k$.
The vector of ones $1_{n^2}$ is mapped to the vector of ones $1_k$ by
$A_{eq}$, i.e., $A_{eq} 1_{n^2} = 1_k$. Define
$g = (g_{i_1}, \ldots, g_{i_k})^T \in \mathbb{Z}^k$.

Now we are in a position to state the primal problem.

(PP)  Find $x = (x_1, \ldots, x_{n^2})^T \in \mathbb{Z}^{n^2}$ such that
      $1 \le x_i \le n$ for $i = 1, \ldots, n^2$,
      $A_{\pi_r} x <> 0$ for $r = 1, 2, 3$ and
      $A_{eq} x = g$.

We restrict ourselves to this mathematical model and do not refer
directly to the classical Sudoku puzzle. In particular, we will not
investigate the relation of this model to the Sudoku puzzle in detail;
this has already been described in [6].
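To make the three conditions of (PP) concrete, here is a minimal sketch that tests them for a small grid with $n = 4$. The concrete permutations are an assumption made for illustration only (they are not spelled out in this paper): cells are numbered row by row from 0, $\pi_1$ is the identity (rows), $\pi_2$ groups the cells of each column and $\pi_3$ groups the cells of each 2 × 2 block. The helper names, the sample grid and the sample givens are likewise hypothetical.

```python
from itertools import combinations


def build_A(n):
    """Block-diagonal A: one row (+1, -1) per pair of positions in each group
    of n consecutive columns (same matrix as the inductive definition)."""
    rows = []
    for b in range(n):
        for i, j in combinations(range(n), 2):
            row = [0] * (n * n)
            row[b * n + i], row[b * n + j] = 1, -1
            rows.append(row)
    return rows


def assumed_permutations(m):
    """Assumed pi_1, pi_2, pi_3 for n = m*m with row-major cell numbering."""
    n = m * m
    rows = list(range(n * n))                              # identity
    cols = [(j % n) * n + (j // n) for j in range(n * n)]  # transpose
    blocks = []
    for j in range(n * n):
        r, c = divmod(j, n)
        blocks.append(((r // m) * m + c // m) * n + (r % m) * m + c % m)
    return [rows, cols, blocks]


def apply_A_pi(A, pi, x):
    """Compute A_pi x, where column j of A_pi is column pi[j] of A."""
    return [sum(row[pi[j]] * x[j] for j in range(len(x))) for row in A]


def solves_PP(x, m, givens):
    """givens maps a 0-based cell index to its value (the role of A_eq x = g)."""
    n = m * m
    A = build_A(n)
    in_range = all(1 <= v <= n for v in x)
    all_nonzero = all(all(v != 0 for v in apply_A_pi(A, pi, x))
                      for pi in assumed_permutations(m))
    matches_givens = all(x[i] == g for i, g in givens.items())
    return in_range and all_nonzero and matches_givens


grid = [3, 4, 1, 2,
        2, 1, 3, 4,
        1, 2, 4, 3,
        4, 3, 2, 1]
print(solves_PP(grid, 2, {0: 3, 5: 1}))  # True
print(solves_PP(grid, 2, {0: 1}))        # False: the given in cell 0 is violated
```

Under these assumptions, $A_{\pi_r} x <> 0$ says exactly that any two cells in the same row, column or block hold different values.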

Another problem modeling Sudoku was introduced by Kaibel
and Koch [7]. Their linear model consists of 0-1 variables and
contains equality constraints. The same type of problem was
considered by Provan [8]. Neither of them considered duality properties.



3. The Dual Problem

We introduce the dual problem.

(DP)  Find $\lambda \in \{-1, +1\}^{n \cdot s(n)}$ such that
      $A_{\pi_r} A_{\pi_1}^T \lambda <> 0$ for $r = 1, 2, 3$ and
      $A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$.

The last condition $A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$ can be
reformulated using a componentwise description

\[ [A_{\pi_1}^T \lambda]_{i_l} = 2 g_{i_l} - (n+1) \quad \text{for } l = 1, \ldots, k. \]

The dual problem is closely related to the generalized sign function
introduced in [6]. The generalized sign function is based on the classical
sign function and is also denoted by sgn.

Definition 3.1. Let $s \ge 1$. For any point
$y = (y_1, \ldots, y_s)^T \in \mathbb{Z}^s$ with $y <> 0$ we define the
generalized sign function
$\mathrm{sgn} : \mathbb{Z}^s \to \mathbb{Z}^s$ by

\[
\mathrm{sgn}(y) = \begin{pmatrix} \mathrm{sgn}(y_1) \\ \vdots \\ \mathrm{sgn}(y_s) \end{pmatrix}.
\]
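As a small illustration of Definition 3.1, the generalized sign function can be sketched as follows. The function name and the choice to raise an error when some component is zero are assumptions of this sketch; the definition itself simply does not cover that case.

```python
def generalized_sgn(y):
    """Componentwise sign of an integer vector y with y <> 0."""
    if any(v == 0 for v in y):
        # Definition 3.1 only covers vectors without zero components.
        raise ValueError("sgn(y) is only defined for y <> 0")
    return [1 if v > 0 else -1 for v in y]


print(generalized_sgn([3, -2, 7]))  # [1, -1, 1]
```

The result always lies in $\{-1, +1\}^s$, which is why dual points are taken from this set.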

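We continue with some preparatory lemmas.

Lemma 3.2. Let $\pi$ be a permutation on $\{1, \ldots, n^2\}$ and let
$x = (x_1, \ldots, x_{n^2})^T \in \mathbb{Z}^{n^2}$, such that
$1 \le x_i \le n$ for $i = 1, \ldots, n^2$ and $A_\pi x <> 0$. Then
$\lambda = \mathrm{sgn}(A_{\pi_1} x)$ satisfies
$\lambda \in \{-1, +1\}^{n \cdot s(n)}$ and
$A_\pi A_{\pi_1}^T \lambda <> 0$.

Proof. The property $\lambda \in \{-1, +1\}^{n \cdot s(n)}$ follows from
$A_\pi x <> 0$ and the definition of sgn. Using [6, Theorem 5.1] and the
equation $A_\pi 1_{n^2} = 0$ we obtain

\[
\begin{aligned}
A_\pi A_{\pi_1}^T \lambda
  &= A_\pi A_{\pi_1}^T \mathrm{sgn}(A_{\pi_1} x) \\
  &= A_\pi \bigl( A_{\pi_1}^T \mathrm{sgn}(A_{\pi_1} x) + (n+1) 1_{n^2} \bigr) - (n+1) A_\pi 1_{n^2} \\
  &<> 0,
\end{aligned}
\]

which is the desired result. □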

Lemma 3.3. Let $\pi$ be a permutation on $\{1, \ldots, n^2\}$ and let
$\lambda$ be a point in $\{-1, +1\}^{n \cdot s(n)}$. Then
$-(n-1) \le [A_\pi^T \lambda]_i \le n-1$ for $i = 1, \ldots, n^2$.

Proof. We divide the proof of this lemma into three steps. First we
prove it for the matrix $A(n)$, then for $A$ and, finally, for $A_\pi$.
The first claim reads as $-(n-1) \le [A(n)^T \lambda]_i \le n-1$ for
$\lambda \in \{-1, +1\}^{s(n)}$ and $i = 1, \ldots, n$. We prove this
claim by induction on $n \ge 2$ and start with $n = 2$. Consider

\[ [A(2)^T \lambda]_i = [(+1 \;\; -1)^T \lambda]_i \]

for $\lambda \in \{-1, +1\}$ and $i = 1, 2$. The induction claim is true
for $n = 2$. Assume the induction claim has been proved for $n-1$.
Consider

\[
[A(n)^T \lambda]_i =
\begin{cases}
\sum_{j=1}^{n-1} \lambda_j, & \text{if } i = 1, \\
-\lambda_{i-1} + [A(n-1)^T (\lambda_n, \ldots, \lambda_{s(n)})^T]_{i-1}, & \text{if } 2 \le i \le n,
\end{cases}
\]

for $\lambda = (\lambda_1, \ldots, \lambda_{s(n)})^T \in \{-1, +1\}^{s(n)}$
and $i = 1, \ldots, n$. The first expression satisfies
$-(n-1) \le \sum_{j=1}^{n-1} \lambda_j \le n-1$ for
$(\lambda_1, \ldots, \lambda_{s(n)})^T \in \{-1, +1\}^{s(n)}$. Using
$s(n-1) = s(n) - (n-1)$ and the induction hypothesis,

\[
\begin{aligned}
-(n-1) &= -1 - (n-2) \\
       &\le -\lambda_{i-1} + [A(n-1)^T (\lambda_n, \ldots, \lambda_{s(n)})^T]_{i-1} \\
       &\le 1 + (n-2) \\
       &= n-1
\end{aligned}
\]

for $\lambda = (\lambda_1, \ldots, \lambda_{s(n)})^T \in \{-1, +1\}^{s(n)}$
and $i = 2, \ldots, n$. This shows the induction claim.

We extend this claim to the matrix $A$, which contains the matrices
$A(n)$ in the diagonal. Let
$\lambda = (\lambda_1, \ldots, \lambda_{n \cdot s(n)})^T \in \{-1, +1\}^{n \cdot s(n)}$
and let $i \in \{1, \ldots, n^2\}$. There exists some
$j \in \{1, \ldots, n\}$ with $(j-1)n < i \le jn$, such that

\[
[A^T \lambda]_i = [A(n)^T (\lambda_{(j-1) s(n) + 1}, \ldots, \lambda_{j \cdot s(n)})^T]_{i - (j-1)n}
\]

and this term satisfies the desired inequality.

The matrix $A_\pi^T$ is a permutation of the rows of $A^T$, and this
completes the proof of the lemma. □
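The first step of the proof can be checked by brute force for small $n$. The sketch below enumerates every $\lambda \in \{-1, +1\}^{s(n)}$ and tests the bound for $A(n)$; it is a sanity check under the stated definitions, not a replacement for the induction. The helper names are hypothetical, and $A(n)$ is built from pairs of positions, which yields the same rows as the inductive definition.

```python
from itertools import combinations, product


def build_A_n(n):
    """A(n): one row per pair i < j with +1 in column i and -1 in column j."""
    rows = []
    for i, j in combinations(range(n), 2):
        row = [0] * n
        row[i], row[j] = 1, -1
        rows.append(row)
    return rows


def transpose_times(A, lam, n):
    """Compute A^T lam for a matrix A with n columns."""
    return [sum(A[r][c] * lam[r] for r in range(len(A))) for c in range(n)]


def bound_holds(n):
    """Check -(n-1) <= [A(n)^T lam]_i <= n-1 for every lam in {-1,+1}^{s(n)}."""
    A = build_A_n(n)
    for lam in product((-1, 1), repeat=len(A)):
        if any(abs(v) > n - 1 for v in transpose_times(A, lam, n)):
            return False
    return True


print([bound_holds(n) for n in (2, 3, 4, 5)])  # [True, True, True, True]
```

The check succeeds because every column of $A(n)$ has exactly $n-1$ nonzero entries, each equal to $\pm 1$.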

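Lemma 3.4. Let $\pi$ be a permutation on $\{1, \ldots, n^2\}$ and let
$\lambda$ be a point in $\{-1, +1\}^{n \cdot s(n)}$, such that
$A_\pi A_{\pi_1}^T \lambda <> 0$. Then

\[ x = (x_1, \ldots, x_{n^2})^T = \tfrac{1}{2} \bigl( A_{\pi_1}^T \lambda + (n+1) 1_{n^2} \bigr) \]

satisfies $x \in \mathbb{Z}^{n^2}$, $1 \le x_i \le n$ for
$i = 1, \ldots, n^2$ and $A_\pi x <> 0$.

Proof. The point $x$ consists of integer components. Indeed, all defining
quantities are integers, and each component of $A_{\pi_1}^T \lambda$ is a
sum of $n-1$ terms $\pm 1$ (each column of $A$ has exactly $n-1$ nonzero
entries), so it has the same parity as $n-1$ and becomes even after
adding $n+1$.
Using Lemma 3.3, $A_{\pi_1}^T \lambda$ satisfies

\[ -(n-1) \le [A_{\pi_1}^T \lambda]_i \le n-1 \]

for $i = 1, \ldots, n^2$. Adding $n+1$ yields

\[ 2 \le [A_{\pi_1}^T \lambda + (n+1) 1_{n^2}]_i \le 2n \]

for $i = 1, \ldots, n^2$, and this shows $1 \le x_i \le n$ for
$i = 1, \ldots, n^2$. Using the equation $A_\pi 1_{n^2} = 0$, we obtain

\[
A_\pi x = \tfrac{1}{2} \bigl( A_\pi A_{\pi_1}^T \lambda + (n+1) A_\pi 1_{n^2} \bigr) <> 0,
\]

which completes the proof. □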

Usually duality results are stated in the following sense: If there
exist a primal feasible point and a dual feasible point and the
objective values are equal, then the primal feasible point solves the
primal problem and the dual feasible point solves the dual problem.

Sometimes duality results are stated in another way in the literature
(compare Chvátal [2, Theorem 5.1]): If the primal problem is solvable,
then the dual problem is solvable and the optimal values are equal.

The relation between the primal problem and the dual problem is
examined in the next theorem, and the formulation is of the second type.
If the primal problem is solvable, then the dual problem is solvable and
there exists an explicit formula for the dual solution. An analogous
statement holds for the dual problem.

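Theorem 3.5. The following statements hold:

(i) If $x$ solves (PP), then $\lambda = \mathrm{sgn}(A_{\pi_1} x)$ solves
(DP).

(ii) If $\lambda$ solves (DP), then
$x = \tfrac{1}{2} (A_{\pi_1}^T \lambda + (n+1) 1_{n^2})$ solves (PP).

Proof. (i) Assume $x = (x_1, \ldots, x_{n^2})^T$ solves (PP). Then
$1 \le x_i \le n$ for $i = 1, \ldots, n^2$, $A_{\pi_r} x <> 0$ for
$r = 1, 2, 3$ and $A_{eq} x = g$. Let
$\lambda = \mathrm{sgn}(A_{\pi_1} x)$, then
$\lambda \in \{-1, +1\}^{n \cdot s(n)}$ and
$A_{\pi_r} A_{\pi_1}^T \lambda <> 0$ for $r = 1, 2, 3$ by Lemma 3.2.
Using [6, Theorem 5.2],

\[
\begin{aligned}
A_{eq} A_{\pi_1}^T \lambda
  &= A_{eq} A_{\pi_1}^T \mathrm{sgn}(A_{\pi_1} x) \\
  &= A_{eq} \bigl( A_{\pi_1}^T \mathrm{sgn}(A_{\pi_1} x) + (n+1) 1_{n^2} \bigr) - (n+1) 1_k \\
  &= 2 A_{eq} x - (n+1) 1_k \\
  &= 2g - (n+1) 1_k,
\end{aligned}
\]

i.e., $\lambda$ solves (DP).

(ii) Assume $\lambda$ solves (DP). Then
$\lambda \in \{-1, +1\}^{n \cdot s(n)}$,
$A_{\pi_r} A_{\pi_1}^T \lambda <> 0$ for $r = 1, 2, 3$ and
$A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$. Let

\[ x = \tfrac{1}{2} \bigl( A_{\pi_1}^T \lambda + (n+1) 1_{n^2} \bigr), \]

then $x \in \mathbb{Z}^{n^2}$, $1 \le x_i \le n$ for
$i = 1, \ldots, n^2$ and $A_{\pi_r} x <> 0$ for $r = 1, 2, 3$ by
Lemma 3.4. The equation

\[
\begin{aligned}
A_{eq} x &= \tfrac{1}{2} \bigl( A_{eq} A_{\pi_1}^T \lambda + (n+1) A_{eq} 1_{n^2} \bigr) \\
         &= \tfrac{1}{2} \bigl( 2g - (n+1) 1_k + (n+1) 1_k \bigr) \\
         &= g
\end{aligned}
\]

shows that $x$ solves (PP). □

Figure 1. A solution of a 4 × 4 Sudoku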

Example 3.6. We illustrate Theorem 3.5 (i) with a Sudoku of size
$n = 4$, i.e., $s(n) = 6$. The solution $x$ is depicted in Fig. 1. At
this moment it does not matter how the original problem was formulated
and where the givens were located. The point
$\lambda = \mathrm{sgn}(A_{\pi_1} x) \in \{-1, +1\}^{n \cdot s(n)}$
consists of the values

    λ = (-1, +1, +1, +1, +1, -1,
         +1, -1, -1, -1, -1, -1,
         -1, -1, -1, -1, -1, +1,
         +1, +1, +1, +1, +1, +1)

and is the corresponding dual solution. This dual point describes the
comparison of the values in two cells in the same row in Fig. 1. The
first component $-1$ of $\lambda$ describes that the content of cell 1 in
row 1 (which is a 3) is smaller than the content of cell 2 in row 1
(which is a 4).

This example can also be used to illustrate statement (ii) of Theorem
3.5. Defining $\lambda$ by the series of $+1$s and $-1$s in the example,
the point $x = \tfrac{1}{2} (A_{\pi_1}^T \lambda + 5 \cdot 1_{4^2})$ is
depicted in Fig. 1.
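Both directions of the example can be replayed numerically. The following sketch recomputes $x$ from the dual point listed in the example via the formula of Theorem 3.5 (ii) and then checks that the generalized sign of $A_{\pi_1} x$ returns the same $\lambda$. It assumes that $\pi_1$ is the identity and that cells are numbered row by row, which matches the row-wise reading of $\lambda$ given in the example; the function names are hypothetical.

```python
from itertools import combinations


def build_A(n):
    """Block-diagonal A with n copies of A(n), rows ordered by pairs i < j."""
    rows = []
    for b in range(n):
        for i, j in combinations(range(n), 2):
            row = [0] * (n * n)
            row[b * n + i], row[b * n + j] = 1, -1
            rows.append(row)
    return rows


def primal_from_dual(lam, n):
    """x = (A^T lam + (n + 1) * 1) / 2, with pi_1 taken as the identity."""
    A = build_A(n)
    At_lam = [sum(A[r][c] * lam[r] for r in range(len(A)))
              for c in range(n * n)]
    return [(v + n + 1) // 2 for v in At_lam]


def dual_from_primal(x, n):
    """lam = sgn(A x); assumes A x has no zero component."""
    A = build_A(n)
    return [1 if sum(a * v for a, v in zip(row, x)) > 0 else -1 for row in A]


lam = [-1, +1, +1, +1, +1, -1,
       +1, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, +1,
       +1, +1, +1, +1, +1, +1]
x = primal_from_dual(lam, 4)
for r in range(4):
    print(x[4 * r: 4 * r + 4])
# [3, 4, 1, 2]
# [2, 1, 3, 4]
# [1, 2, 4, 3]
# [4, 3, 2, 1]
print(dual_from_primal(x, 4) == lam)  # True
```

The first row starts with 3 and 4, in agreement with the description of the first component of $\lambda$ in the example.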

The primal and the dual problem as introduced in Sections 2 and 3
are constraint satisfaction problems and do not possess an objective
function. Therefore it is not possible to state properties involving
duality gaps for these problems. In the next section we replace these
problems by two optimization problems and derive duality results for
these optimization problems.

4. The Primal and Dual Optimization Problems 

We introduce the primal and the dual optimization problems, which
are equivalent to the primal and the dual problem, respectively.

The primal optimization problem will consist of points which have
undefined components, reflecting empty cells in a Sudoku puzzle. We
describe these empty cells by the token $\infty$ and we define
$\mathbb{Z}_\infty = \mathbb{Z} \cup \{\infty\}$.

When we allow points to possess infinity components, we have to
extend several classical notations. The addition of two numbers, where
one or both may be infinity, is defined as
$\infty + x = x + \infty = \infty + \infty = \infty$ for
$x \in \mathbb{Z}$. We define the product
$0 \cdot \infty = \infty \cdot 0 = 0$ and
$x \cdot \infty = \infty \cdot x = \infty$ for each $x \in \mathbb{Z}$,
$x \ne 0$. Based on this extended definition of addition
and multiplication, we implicitly extend matrix multiplication to
matrices and vectors with possible infinity components. The token
$\infty$ is different from any number, i.e., $\infty \ne x$ and
$x \ne \infty$ for each $x \in \mathbb{Z}$. In particular, $\infty$ is
not equal to zero.

This extension reflects the meaning of $\infty$ as an undefined state.
Something defined combined with something undefined creates an undefined
result, and something undefined is different from anything defined. The
token $\infty$ has nothing to do with the common understanding of
"infinitely large". It is just a placeholder for "nothing", i.e., an
empty cell in the Sudoku square.
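The extended arithmetic can be sketched directly from these rules. In the sketch below the token $\infty$ is represented by a string constant and all names are hypothetical. The product $\infty \cdot \infty$ is not defined in the text, so the sketch rejects it; it cannot occur in a matrix-vector product with an integer matrix.

```python
INF = "oo"  # token for an empty cell; it is not a number


def ext_add(a, b):
    """oo + x = x + oo = oo + oo = oo; otherwise ordinary addition."""
    return INF if a == INF or b == INF else a + b


def ext_mul(a, b):
    """0 * oo = oo * 0 = 0 and x * oo = oo * x = oo for x != 0."""
    if a == INF and b == INF:
        raise ValueError("oo * oo is not defined in the text")
    if a == INF or b == INF:
        other = b if a == INF else a
        return 0 if other == 0 else INF
    return a * b


def ext_matvec(A, x):
    """Product of an integer matrix A with a vector x that may contain oo."""
    result = []
    for row in A:
        total = 0
        for a, v in zip(row, x):
            total = ext_add(total, ext_mul(a, v))
        result.append(total)
    return result


A3 = [[1, -1, 0], [1, 0, -1], [0, 1, -1]]  # the matrix A(3)
print(ext_matvec(A3, [2, INF, 2]))  # ['oo', 0, 'oo']
```

In this small example the two known values are equal, so one component of the product is zero; every comparison that involves the empty cell yields the token, which is different from zero.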

In a classical Sudoku puzzle the term $A_{\pi_r} x <> 0$ with
$x \in \mathbb{Z}_\infty^{n^2}$ (i.e., some of the components of $x$ may
be unknown) describes points where any two known values in the same row,
the same column or the same block are distinct.

We continue with the primal feasible set

\[
\begin{aligned}
F_P = \{ x = (x_1, \ldots, x_{n^2})^T \in \mathbb{Z}_\infty^{n^2} \mid{}
  & 1 \le x_i \le n \text{ or } x_i = \infty \text{ for } i = 1, \ldots, n^2, \\
  & A_{\pi_r} x <> 0 \text{ for } r = 1, 2, 3 \text{ and } \\
  & A_{eq} x = g \}.
\end{aligned}
\]

It is easy to construct examples where $F_P$ is empty and examples
where $F_P$ is nonempty. We define the primal objective function by
$f_P(x) = \#\{1 \le i \le n^2 \mid x_i = \infty\}$ for each
$x = (x_1, \ldots, x_{n^2})^T \in \mathbb{Z}_\infty^{n^2}$.
The primal objective function is bounded from below by 0. The primal
optimization problem is defined by

(PP_opt)  Minimize $f_P(x)$ subject to $x \in F_P$.

The relevance of the primal optimization problem is the equivalence 
to the original primal problem and the possible definition of solution 
methods for the generalized Sudoku problem. A common strategy for 
solving a Sudoku puzzle creates points contained in the feasible set of 
the primal optimization problem. These points consist of unpopulated 
cells and distinct values in the populated cells of each row, column and 
block. 

This type of solution algorithm was proposed by Crook [3]. His
algorithm defines in each step a new feasible point with a lower value
of the objective function until a solution is reached. By definition of
the primal optimization problem, a lower value of the objective
function means that the new point contains at least one more populated
cell.

The primal optimal (minimal) value $\min\{f_P(x) \mid x \in F_P\}$ of
this optimization problem is denoted by $v_P$ and satisfies $v_P \ge 0$.
A point $x \in F_P$ with $f_P(x) = v_P$ is called a solution of the
primal optimization problem.
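A minimal sketch of the primal objective and of membership in $F_P$ for $n = 4$ is given below. It uses the interpretation stated above, namely that $A_{\pi_r} x <> 0$ holds for a partially filled $x$ exactly when the known values in each row, column and block are pairwise distinct. The row-major cell numbering, the 2 × 2 blocks, the helper names and the sample givens are assumptions of the sketch; `None` stands for the token $\infty$.

```python
EMPTY = None  # stands for the token oo (an empty cell)


def f_P(x):
    """Primal objective: the number of empty cells."""
    return sum(1 for v in x if v is EMPTY)


def groups(m):
    """Cell indices of all rows, columns and m x m blocks for n = m*m."""
    n = m * m
    rows = [[r * n + c for c in range(n)] for r in range(n)]
    cols = [[r * n + c for r in range(n)] for c in range(n)]
    blocks = [[(br * m + dr) * n + bc * m + dc
               for dr in range(m) for dc in range(m)]
              for br in range(m) for bc in range(m)]
    return rows + cols + blocks


def in_F_P(x, m, givens):
    """givens maps a 0-based cell index to its prescribed value."""
    n = m * m
    if any(v is not EMPTY and not 1 <= v <= n for v in x):
        return False
    if any(x[i] != g for i, g in givens.items()):  # A_eq x = g
        return False
    for group in groups(m):
        known = [x[i] for i in group if x[i] is not EMPTY]
        if len(known) != len(set(known)):  # two equal known values
            return False
    return True


givens = {0: 3, 5: 1, 10: 4, 15: 1}
start = [givens.get(i, EMPTY) for i in range(16)]
step = list(start)
step[1] = 4  # populate one more cell
print(in_F_P(start, 2, givens), f_P(start))  # True 12
print(in_F_P(step, 2, givens), f_P(step))    # True 11
```

Populating one more cell without creating a conflict keeps the point feasible and lowers the objective value by one, which is the step described for Crook's algorithm.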

Theorem 4.1. If $F_P \ne \emptyset$, then (PP_opt) is solvable.

Proof. The primal objective function $f_P$ is bounded from below by zero
and attains only integer values. □

The primal optimal value $v_P = 0$ if and only if one (or each) solution
$x = (x_1, \ldots, x_{n^2})^T$ of (PP_opt) satisfies $x_i \ne \infty$ for
$i = 1, \ldots, n^2$. We describe the relation between the primal
problem and the primal optimization problem. This relation follows in a
straightforward manner from the definitions of the corresponding
problems.

Theorem 4.2. Let $x \in \mathbb{Z}_\infty^{n^2}$. The following
statements are equivalent:

(i) $x$ solves (PP).

(ii) $x$ solves (PP_opt) and the primal optimal value $v_P = 0$.

We proceed with the dual optimization problem. The dual feasible
set is denoted by

\[
F_D = \{ \lambda \in \{-1, +1\}^{n \cdot s(n)} \mid A_{\pi_r} A_{\pi_1}^T \lambda <> 0 \text{ for } r = 1, 2, 3 \}.
\]

By a special choice of $\pi_r$ it is possible to construct examples where
$F_D = \emptyset$. If the primal problem is a classical (solvable) Sudoku
puzzle, then the dual feasible set $F_D$ is nonempty. The dual objective
function is defined by

\[
f_D(\lambda) = \#\{ 1 \le l \le k \mid [A_{eq} A_{\pi_1}^T \lambda]_l = 2 g_{i_l} - (n+1) \} - k
\]

for each $\lambda \in \{-1, +1\}^{n \cdot s(n)}$. The dual objective
function is bounded from above by 0. The dual optimization problem is
given by

(DP_opt)  Maximize $f_D(\lambda)$ subject to $\lambda \in F_D$.

The dual optimal (maximal) value
$\max\{f_D(\lambda) \mid \lambda \in F_D\}$ is denoted by $v_D$ and
satisfies $v_D \le 0$. A point $\lambda \in F_D$ with
$f_D(\lambda) = v_D$ is called a solution of the dual optimization
problem.

Theorem 4.3. If $F_D \ne \emptyset$, then (DP_opt) is solvable.

Proof. The dual objective function $f_D$ is bounded from above by zero
and attains only integer values. □

It is possible to characterize dual feasible points in terms of a primal
property.

Theorem 4.4. Let $x \in \mathbb{Z}^{n^2}$, such that $A_{\pi_r} x <> 0$
for $r = 1, 2, 3$ and let $\lambda = \mathrm{sgn}(A_{\pi_1} x)$. Then
$\lambda \in F_D$.

Proof. This follows from Lemma 3.2. □

Theorem 4.5. Let $\lambda \in \{-1, +1\}^{n \cdot s(n)}$ and
$x \in \mathbb{Z}^{n^2}$, such that
$x = \tfrac{1}{2} (A_{\pi_1}^T \lambda + (n+1) 1_{n^2})$. The following
statements are equivalent:

(i) $\lambda \in F_D$.

(ii) $A_{\pi_r} x <> 0$ for $r = 1, 2, 3$.

Proof. "(i) ⇒ (ii)" This direction follows from Lemma 3.4.
"(ii) ⇒ (i)" Using the assumptions and $A_{\pi_r} 1_{n^2} = 0$,

\[
\begin{aligned}
A_{\pi_r} A_{\pi_1}^T \lambda
  &= 2 A_{\pi_r} x - (n+1) A_{\pi_r} 1_{n^2} \\
  &= 2 A_{\pi_r} x \\
  &<> 0
\end{aligned}
\]

for $r = 1, 2, 3$, i.e., $\lambda \in F_D$. □

The dual optimal value $v_D = 0$ if and only if one (or each) solution
$\lambda$ of (DP_opt) satisfies
$A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$. We describe the relation
between the dual problem and the dual optimization problem. This
relation follows in a straightforward manner from the definitions of the
corresponding problems.

Theorem 4.6. Let $\lambda \in \{-1, +1\}^{n \cdot s(n)}$. The following
statements are equivalent:

(i) $\lambda$ solves (DP).

(ii) $\lambda$ solves (DP_opt) and the dual optimal value $v_D = 0$.
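The dual objective function $f_D$ counts how many of the $k$ given conditions a dual point meets and subtracts $k$. The sketch below evaluates it for $n = 4$, assuming that $\pi_1$ is the identity and that cells are numbered row by row from 0. The helper names and the sample givens are hypothetical, the dual point is the one listed in Example 3.6, and membership in $F_D$ is not checked here.

```python
from itertools import combinations


def build_A(n):
    """Block-diagonal A with n copies of A(n), rows ordered by pairs i < j."""
    rows = []
    for b in range(n):
        for i, j in combinations(range(n), 2):
            row = [0] * (n * n)
            row[b * n + i], row[b * n + j] = 1, -1
            rows.append(row)
    return rows


def f_D(lam, n, givens):
    """#{l : [A_eq A^T lam]_l = 2 g_l - (n + 1)} - k, with pi_1 = identity.

    givens maps a 0-based cell index i_l to the given value g_{i_l}.
    """
    A = build_A(n)
    hits = 0
    for cell, g in givens.items():
        component = sum(A[r][cell] * lam[r] for r in range(len(A)))
        if component == 2 * g - (n + 1):
            hits += 1
    return hits - len(givens)


lam = [-1, +1, +1, +1, +1, -1,
       +1, -1, -1, -1, -1, -1,
       -1, -1, -1, -1, -1, +1,
       +1, +1, +1, +1, +1, +1]
print(f_D(lam, 4, {0: 3, 5: 1, 10: 4, 15: 1}))  # 0
print(f_D(lam, 4, {0: 4, 5: 1}))                # -1
```

A value of 0 means that every given condition is met; each violated condition lowers the value by one.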



5. Duality Results

In this section we collect the classical duality statements for the
generalized Sudoku problem, namely weak duality, duality gap and
strong duality. We start with the weak duality statement.

Theorem 5.1 (Weak Duality). The following statements hold:

(i) $f_D(\lambda) \le 0 \le f_P(x)$ for each $x \in F_P$ and
$\lambda \in F_D$.

(ii) $v_D \le 0 \le v_P$.

Proof. We know from the definition of the primal and the dual
optimization problem that the dual objective function is bounded from
above by zero and the primal objective function is bounded from below
by zero. This shows (i) and implies (ii). □

Theorem 5.2. Let $x \in F_P$, $\lambda \in F_D$ and
$f_P(x) = f_D(\lambda)$. Then $x$ solves (PP_opt) with primal optimal
value $v_P = 0$ and $\lambda$ solves (DP_opt) with dual optimal value
$v_D = 0$.

Proof. Using Theorem 5.1,
$0 \le v_P \le f_P(x) = f_D(\lambda) \le v_D \le 0$. This implies
$f_P(x) = v_P = 0$ and $f_D(\lambda) = v_D = 0$. □

Based on the weak duality statement, we define the term duality gap
by $v_P - v_D$, which is a nonnegative value by Theorem 5.1. The next
theorem characterizes duality gaps.

Theorem 5.3 (Strong Duality). Let $F_P \ne \emptyset$ and
$F_D \ne \emptyset$. The following statements are equivalent:

(i) $v_P = v_D$.

(ii) There exists a solution $x = (x_1, \ldots, x_{n^2})^T$ of (PP_opt),
such that $x_i \ne \infty$ for $i = 1, \ldots, n^2$.

(iii) There exists a solution $\lambda$ of (DP_opt), such that
$A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$.

Proof. Assume (i) holds. By Theorems 4.1 and 4.3, (PP_opt) and (DP_opt)
are solvable. Let $x$ be a solution of (PP_opt) and let $\lambda$ be a
solution of (DP_opt). By Theorem 5.1 (i) and Theorem 5.2,
$f_P(x) = v_P = 0$ and $f_D(\lambda) = v_D = 0$, i.e., (ii) and (iii)
hold.

"(ii) ⇒ (i)" Let $x = (x_1, \ldots, x_{n^2})^T$ be a solution of
(PP_opt), such that $x_i \ne \infty$ for $i = 1, \ldots, n^2$. This
implies $f_P(x) = 0 = v_P$, hence $x$ solves (PP) by Theorem 4.2. Define
$\lambda = \mathrm{sgn}(A_{\pi_1} x)$, then $\lambda$ solves (DP) by
Theorem 3.5 (i). By Theorem 4.6, $\lambda$ solves (DP_opt) with
$f_D(\lambda) = v_D = 0$.

"(iii) ⇒ (i)" Let $\lambda$ be a solution of (DP_opt), such that
$A_{eq} A_{\pi_1}^T \lambda = 2g - (n+1) 1_k$. This implies
$f_D(\lambda) = v_D = 0$, hence $\lambda$ solves (DP) by Theorem 4.6.
Define $x = \tfrac{1}{2} (A_{\pi_1}^T \lambda + (n+1) 1_{n^2})$, then $x$
solves (PP) by Theorem 3.5 (ii). By Theorem 4.2, $x$ solves (PP_opt) with
$f_P(x) = v_P = 0$. □

It is possible to express the preceding theorem in terms of the original
problems (PP) and (DP). The strong duality result states that there is
no duality gap between (PP_opt) and (DP_opt) if and only if the
primal problem (PP) is solvable, and equivalently if and only if the
dual problem (DP) is solvable.